complex behavior
Autonomous Reinforcement Learning via Subgoal Curricula
Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents. However, the success of current reinforcement learning algorithms is predicated on an often under-emphasised requirement -- each trial needs to start from a fixed initial state distribution. Unfortunately, resetting the environment to its initial state after each trial requires a substantial amount of human supervision and extensive instrumentation of the environment, which defeats the goal of autonomous acquisition of complex behaviors. In this work, we propose Value-accelerated Persistent Reinforcement Learning (VaPRL), which generates a curriculum of initial states such that the agent can bootstrap on the success of easier tasks to efficiently learn harder tasks. The agent also learns to reach the initial states proposed by the curriculum, minimizing the reliance on human interventions in the learning process. We observe that VaPRL reduces the interventions required by three orders of magnitude compared to episodic RL while outperforming prior state-of-the-art methods for reset-free RL both in terms of sample efficiency and asymptotic performance on a variety of simulated robotics problems.
No-brainer: Morphological Computation driven Adaptive Behavior in Soft Robots
It is prevalent in contemporary AI and robotics to separately postulate a brain modeled by neural networks and employ it to learn intelligent and adaptive behavior. While this method has worked very well for many types of tasks, it isn't the only type of intelligence that exists in nature. In this work, we study the ways in which intelligent behavior can be created without a separate and explicit brain for robot control, but rather solely as a result of the computation occurring within the physical body of a robot. Specifically, we show that adaptive and complex behavior can be created in voxel-based virtual soft robots by using simple reactive materials that actively change the shape of the robot, and thus its behavior, under different environmental cues. We demonstrate a proof of concept for the idea of closed-loop morphological computation, and show that in our implementation, it enables behavior mimicking logic gates, enabling us to demonstrate how such behaviors may be combined to build up more complex collective behaviors.
Keywords: Soft robotics, Adaptive behavior
1 Introduction and Background
Recent advances in artificial intelligence and machine learning have benefited greatly from the rise of modern deep learning systems, ultimately aimed at artificial general intelligence [22]. The coming-of-age of these artificial neural network systems includes a long history of bio-inspiration, dating back to McCulloch and Pitts [26]. Yet the processes behind biological intelligence reach far beyond systems and processes confined to the brain of living organisms. Our bias toward attributing intelligent behavior to the mind is far from new.
Predicting Complex Behavior in Sparse Asymmetric Networks
Recurrent networks of threshold elements have been studied intensively as associative memories and pattern-recognition devices. While most research has concentrated on fully-connected symmetric networks. These networks can show fixed-point. The approach also provides qualitative insight into why the system behaves as it does and suggests possible applications.
Watch out, Messi: artificial intelligence has finally learned to play football
DeepMind, Google's artificial intelligence division, taught AI humanoids how to work as a team in order to play football together, turning them from flailing tots to proficient players. Researchers ran a computer simulation through an athletic curriculum, giving AI control over humanoids with realistic body masses and movements. It's not the first time DeepMind tried its hand at games. The AI previously mastered chess and Go, a feat that researchers thought was nigh impossible at one point. Then, the group focused on other games, like Mario or Starcraft.
Why AI Needs a Genome - Issue 108: Change
It's Monday morning of some week in 2050 and you're shuffling into your kitchen, drawn by the smell of fresh coffee C-3PO has brewed while he unloaded the dishwasher. "Here you go, Han Solo, I used the new flavor you bought yesterday," C-3PO tells you as he hands you the cup. C-3PO arrived barely a month ago and already has developed a wonderful sense of humor and even some snark. He isn't the real C-3PO, of course--you just named him that because you are a vintage movie buff--but he's the latest NeuroCyber model that comes closest to how people think, talk, and acquire knowledge. He's no match for the original C-3PO's fluency in 6 million forms of communication, but he's got full linguistic mastery and can learn from humans like humans do--from observation and imitation, whether it's using sarcasm or sticking dishes into slots. Unlike early assistants such as Siri or Alexa, which could recognize commands and act upon them, NeuroCybers can evolve into intuitive assistants and companions.
The Math of the Amazing Sandpile - Issue 107: The Edge
One country going Communist was supposed to topple the next, and then the next, and the next. The metaphor drove much of United States foreign policy in the middle of the 20th century. But it had the wrong name. From a physical point of view, it should have been called the "sandpile theory." Real-world political phase transitions tend to happen not in neat sequences, but in sudden coordinated fits, like the Arab Spring, or the collapse of the Eastern Bloc.
Five lines of code could change the way we think about AI
Most artificial intelligence systems can be advanced by adding more: more computing power, more lines of code, more analysis, more neural networks, more machine learning. And this is great if you have large amounts of power and space at your disposal, like on a car, or a rocket ship, or in a data center. But if you don't have all that? Then you have to get simple and think creatively. That's what Johannes Overvelde and his team at AMOLF, a government-funded Dutch physics research institute, did in a new study released this week in Proceedings of the National Academy of Sciences.
John Conway, inventor of the Game of Life, has died of COVID-19
Princeton mathematician John Conway has died of the coronavirus. He was 82 years old. The British-born Conway spent the early part of his career at Cambridge before moving to Princeton University in the 1980s. He made contributions in various areas of mathematics but is best known for his invention of Conway's Game of Life, a cellular automaton in which simple rules give rise to surprisingly complex behaviors. It was made famous by a 1970 Scientific American article and has had a lively community around it ever since then.
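As a rough illustration of how simple rules can produce complex behavior, the sketch below implements one generation of the Game of Life under its standard rules: a dead cell with exactly three live neighbours becomes live, and a live cell with two or three live neighbours survives. The function name `life_step`, the sparse set-of-coordinates representation, and the unbounded grid are choices made for this example only and do not come from the article.

```python
from collections import Counter


def life_step(live):
    """Advance a set of live (row, col) cells by one generation."""
    # For every cell adjacent to a live cell, count its live neighbours.
    counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# A "glider": five cells whose pattern travels diagonally across the grid.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
state = glider
for _ in range(4):
    state = life_step(state)
```

After four generations the glider reappears in its original shape, shifted one cell down and one cell to the right, even though nothing in the rules refers to movement.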
Machine learning reveals links between genetic factors and behavior
Researchers at the University of Utah Health have used machine learning to start making links between seemingly instinctive, random behaviors and the genetic factors that shape such behaviors. Using machine learning to study mice with differences in their genetics and age, the team found that these differences influenced the behavioral sequences the animals expressed while they foraged for food. The researchers believe the methodology could one day be applied to help understand the genomic elements that may shape complex behaviors in humans, including those that lead to disease or psychiatric disorders. "Patterns of complex behavior, like searching for food, are composed of sequences that feel random, spontaneous and free. Using machine learning, we are finding discrete sequences that are reproduced more frequently than you would expect by chance and these sequences are rooted in biology." Gregg and colleagues are venturing into what has previously been considered a controversial new territory called behavioral sequencing. The aim is to understand the architecture of complex behavior and how genetics shape these patterns. The concerns surrounding behavioral genetics research are based on fears that it could lead to eugenic policies. Literally meaning "well-born," eugenics refers to the improvement of humanity using scientific methods such as selective breeding. As outlined by the Nuffield Council on Bioethics, the use of "negative eugenics" has led to some of the worst atrocities in recent history such as the segregation and sterilization of hundreds of thousands of people in the United States and Europe. However, members of the council point out that contemporary research into the area is not necessarily pursuing eugenics-based goals and that the devastating events that have occurred in the past could be learned from to prevent such abuse in the future.
The council acknowledges that there are certain concerns that need to be addressed if research into the field is going to be encouraged. Defining and measuring behaviors can be challenging and there is a risk of misinterpreting or misapplying statistical estimates of heritability. Other concerns include the lack of replicated findings and difficulties in predicting how behavior develops, given how complex the interaction between genes and the environment is. However, the council concludes that despite these concerns, identifying and investigating the genes that influence behavior is still practicable and worthwhile. "There are currently no practical applications of research in the genetics of behavior within the normal range.